High accuracy acoustic modeling using two-level decision-tree based state-tying
Abstract
Phonetic decision-tree based acoustic modeling has been widely used in speech recognition systems. However, the assumption that all states clustered in the same leaf node share both their Gaussians and their mixture weights limits how far the acoustic models can be improved. In this paper, we propose a new structure called a two-level decision tree. With this structure we can make better use of the training data and improve model accuracy and robustness. Two-level decision trees also give more flexibility in controlling the number of parameters. By tuning the balance between first-level and second-level tree nodes, we can obtain better performance with even fewer parameters than the traditional decision-tree based approach. Experiments on the Wall Street Journal tasks show that our approach achieves a word error rate reduction of about 10% over the conventional approach.
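The parameter trade-off mentioned in the abstract can be illustrated with a rough count. The sketch below rests on an assumption that the abstract does not spell out: that first-level leaves hold a shared pool of diagonal-covariance Gaussians, while second-level leaves hold only their own mixture weights over that pool. The function names and all numbers (leaf counts, mixture size, feature dimension) are hypothetical and are not taken from the paper.

```python
def conventional_params(num_leaves, mixtures, dim):
    """Parameters when every tied state owns its Gaussians and its weights.

    Each Gaussian is assumed diagonal: `dim` means + `dim` variances,
    plus one mixture weight per Gaussian.
    """
    return num_leaves * mixtures * (2 * dim + 1)


def two_level_params(first_level_leaves, second_level_leaves, mixtures, dim):
    """Parameters under the ASSUMED two-level reading.

    First-level leaves own the Gaussians (means and variances);
    second-level leaves own only a weight vector over the shared Gaussians.
    """
    gaussian_params = first_level_leaves * mixtures * 2 * dim
    weight_params = second_level_leaves * mixtures
    return gaussian_params + weight_params


# Hypothetical configuration, for illustration only.
baseline = conventional_params(num_leaves=6000, mixtures=8, dim=39)
two_level = two_level_params(
    first_level_leaves=4000, second_level_leaves=12000, mixtures=8, dim=39
)
print(baseline, two_level)
```

Under these made-up settings the two-level layout has twice as many distinct weight vectors as the baseline has tied states, yet fewer parameters overall, because weights are much cheaper than Gaussians. Whether this matches the paper's actual structure cannot be confirmed from the abstract alone.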
Similar resources
Robust decision tree state tying for continuous speech recognition
In this paper, methods of improving the robustness and accuracy of acoustic modeling using decision tree based state tying are described. A new two-level segmental clustering approach is devised which combines the decision tree based state tying with agglomerative clustering of rare acoustic phonetic events. In addition, a unified maximum likelihood framework for incorporating both phonetic and...
High accuracy acoustic modeling based on multi-stage decision tree
In many continuous speech recognition systems based on HMMs, decision tree-based state tying has been used for not only improving the robustness and accuracy of context dependent acoustic modeling but also synthesizing unseen models. To construct the phonetic decision tree, standard method has used just single Gaussian triphone models to cluster states. The coarse clusters generated using just ...
High resolution decision tree based acoustic modeling beyond CART
In this paper, an m-level optimal subtree based phonetic decision tree clustering algorithm is described. Unlike prior approaches, the m-level optimal subtree in the proposed approach is to generate log likelihood estimates using multiple mixture Gaussians for phonetic decision tree based state tying. It provides a more accurate model of the log likelihood variations in node splitting and it is...
Acoustic modeling and language modeling for Cantonese LVCSR
This paper describes our recent work on the development of a large-vocabulary, speaker-independent continuous speech recognition system for Cantonese (a major Chinese dialect). Both acoustic modeling and language modeling are being addressed. For acoustic modeling, we focus on right-context-dependent sub-syllable units. Tying of HMM at model as well as state level is applied based on phonetic k...
Decision tree state tying based on penalized Bayesian information criterion
In this paper, an approach of penalized Bayesian information criterion (pBIC) for decision tree state tying is described. The pBIC is applied to two important applications. First, it is used as a decision tree growing criterion in place of the conventional approach of using a heuristic constant threshold. It is found that original BIC penalty is too low and will not lead to compact decision tre...
Publication date: 1999